AITopics | effective number

In this work, we analyze alternative effective sample size (ESS) metrics for importance sampling algorithms, and discuss a possible extended range of applications. We show the relationship between the ESS expressions used in the literature and two entropy families, the Rényi and Tsallis entropy. The Rényi entropy is connected to the Huggins-Roy's ESS family introduced in \cite{Huggins15}. We prove that that all the ESS functions included in the Huggins-Roy's family fulfill all the desirable theoretical conditions. We analyzed and remark the connections with several other fields, such as the Hill numbers introduced in ecology, the Gini inequality coefficient employed in economics, and the Gini impurity index used mainly in machine learning, to name a few. Finally, by numerical simulations, we study the performance of different ESS expressions contained in the previous ESS families in terms of approximation of the theoretical ESS definition, and show the application of ESS formulas in a variable selection problem.

artificial intelligence, ess-h, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1007/s00180-025-01665-8

2602.22954

Country:

Europe > Italy (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

aec5e2847c5ae90f939ab786774856cc-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 13:37:43 GMT

artificial intelligence, inductive learning, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
North America > Mexico > Yucatán > Mérida (0.04)
Asia > Pakistan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Add feedback

5a33645b5c5b7a9882652526d30d0acc-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 05:40:28 GMT

approximation, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

3ffebb08d23c609875d7177ee769a3e9-Paper.pdf

Neural Information Processing SystemsFeb-12-2026, 00:13:35 GMT

dataset, sparrow, weak rule, (16 more...)

Neural Information Processing Systems

Country:

Asia > Japan (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias

Neural Information Processing SystemsDec-25-2025, 09:11:53 GMT

We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural networks with a single hidden layer in a binary classification setting. We show that when the labels are determined by the sign of a target network with $r$ neurons, with high probability over the initialization of the network and the sampling of the dataset, GF converges in direction (suitably defined) to a network achieving perfect training accuracy and having at most $\mathcal{O}(r)$ linear regions, implying a generalization bound. Unlike many other results in the literature, under an additional assumption on the distribution of the data, our result holds even for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.

convergence guarantee and implicit bias, effective number, shallow univariate relu network, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

Add feedback

aec5e2847c5ae90f939ab786774856cc-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 04:49:10 GMT

artificial intelligence, inductive learning, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
North America > Mexico > Yucatán > Mérida (0.04)
Asia > Pakistan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Add feedback

A Generalisation to other groups

Neural Information Processing SystemsOct-8-2025, 18:13:49 GMT

The residual pathway of Eq. (10) was originally in Finzi et al. [2021a] for general groups Similar to the original residual pathway paper [Finzi et al., 2021a], we could also write linear Further, we give an extension for group convolutional layers. First, we split up the definition from Eq. (5): y ( c For sparsified factored layers S-FC, we extend the K-FAC approximation for factored layers. Similarly to above, we use the existing derivation to approximate KFAC in terms of Jacobians w.r.t. Table 4: Analysis of learned layer types after training FC+CONV model on CIFAR-10. Prior precisions learned for the CONV+FC model are shown on the left in Table 4.

approximation, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Hyperparameter Loss Surfaces Are Simple Near their Optima

Lourie, Nicholas, He, He, Cho, Kyunghyun

arXiv.org Machine LearningOct-6-2025

Hyperparameters greatly impact models' capabilities; however, modern models are too large for extensive search. Instead, researchers design recipes that train well across scales based on their understanding of the hyperparameters. Despite this importance, few tools exist for understanding the hyperparameter loss surface. We discover novel structure in it and propose a new theory yielding such tools. The loss surface is complex, but as you approach the optimum simple structure emerges. It becomes characterized by a few basic features, like its effective dimension and the best possible loss. To uncover this asymptotic regime, we develop a novel technique based on random search. Within this regime, the best scores from random search take on a new distribution we discover. Its parameters are exactly the features defining the loss surface in the asymptotic regime. From these features, we derive a new asymptotic law for random search that can explain and extrapolate its convergence. These new tools enable new analyses, such as confidence intervals for the best possible performance or determining the effective number of hyperparameters. We make these tools available at https://github.com/nicholaslourie/opda .

asymptotic regime, hyperparameter, random search, (15 more...)

arXiv.org Machine Learning

2510.02721

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

3ffebb08d23c609875d7177ee769a3e9-Paper.pdf

Neural Information Processing SystemsOct-2-2025, 14:53:26 GMT

data mining, machine learning, weak rule, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Diego County (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

GRAB: A Risk Taxonomy--Grounded Benchmark for Unsupervised Topic Discovery in Financial Disclosures

Li, Ying, Ma, Tiejun

arXiv.org Artificial IntelligenceSep-29-2025

Risk categorization in 10-K risk disclosures matters for oversight and investment, yet no public benchmark evaluates unsupervised topic models for this task. We present GRAB, a finance-specific benchmark with 1.61M sentences from 8,247 filings and span-grounded sentence labels produced without manual annotation by combining FinBERT token attention, YAKE keyphrase signals, and taxonomy-aware collocation matching. Labels are anchored in a risk taxonomy mapping 193 terms to 21 fine-grained types nested under five macro classes; the 21 types guide weak supervision, while evaluation is reported at the macro level. GRAB unifies evaluation with fixed dataset splits and robust metrics--Accuracy, Macro-F1, Topic BERTScore, and the entropy-based Effective Number of Topics. The dataset, labels, and code enable reproducible, standardized comparison across classical, embedding-based, neural, and hybrid topic models on financial disclosures.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2509.21698

Country: Europe (0.68)

Genre: Research Report (0.65)

Industry:

Law > Business Law (0.70)
Banking & Finance > Trading (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Filters

Collaborating Authors

effective number

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Effective sample size approximations as entropy measures

aec5e2847c5ae90f939ab786774856cc-Paper-Conference.pdf

5a33645b5c5b7a9882652526d30d0acc-Supplemental-Conference.pdf

3ffebb08d23c609875d7177ee769a3e9-Paper.pdf

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias

aec5e2847c5ae90f939ab786774856cc-Paper-Conference.pdf

A Generalisation to other groups

Hyperparameter Loss Surfaces Are Simple Near their Optima

3ffebb08d23c609875d7177ee769a3e9-Paper.pdf

GRAB: A Risk Taxonomy--Grounded Benchmark for Unsupervised Topic Discovery in Financial Disclosures